146 research outputs found

    Anytime planning for agent behaviour

    For an agent to act successfully in a complex and dynamic environment (such as a computer game), it must have a method of generating future behaviour that meets the demands of its environment. One such method is anytime planning. This paper discusses the problems and benefits associated with making a planning system work under the anytime paradigm, and introduces Anytime-UMCP (A-UMCP), an anytime version of the UMCP hierarchical task network (HTN) planner [Erol, 1995]. It also covers the abilities an agent must have in order to execute plans produced by an anytime hierarchical task network planner.

    Exploring Design Space For An Integrated Intelligent System

    Understanding the trade-offs available in the design space of intelligent systems is a major unaddressed element in the study of Artificial Intelligence. In this paper we approach this problem in two ways. First, we discuss the development of our integrated robotic system in terms of its trajectory through design space. Second, we demonstrate the practical implications of architectural design decisions by using this system as an experimental platform for comparing behaviourally similar yet architecturally different systems. The results show that our system occupies a "sweet spot" in design space in terms of the cost of moving information between processing components.

    Active Inference for Integrated State-Estimation, Control, and Learning

    This work presents an approach for control, state-estimation and learning model (hyper)parameters for robotic manipulators. It is based on the active inference framework, prominent in computational neuroscience as a theory of the brain, where behaviour arises from minimizing variational free-energy. The robotic manipulator shows adaptive and robust behaviour compared to state-of-the-art methods. Additionally, we show the exact relationship to classic methods such as PID control. Finally, we show that by learning a temporal parameter and model variances, our approach can deal with unmodelled dynamics, damp oscillations, and remain robust against disturbances and poor initial parameters. The approach is validated on the 'Franka Emika Panda' 7 DoF manipulator. Comment: 7 pages, 6 figures, accepted for presentation at the International Conference on Robotics and Automation (ICRA) 202

    Effects of Training Data Variation and Temporal Representation in a QSR-Based Action Prediction System

    Understanding of behaviour is a crucial skill for Artificial Intelligence systems expected to interact with external agents, whether other AI systems or humans, in scenarios involving co-operation, such as domestic robots capable of helping out with household jobs, or disaster relief robots expected to collaborate and lend assistance to others. It is useful for such systems to be able to quickly learn and re-use models and skills in new situations. Our work centres around a behaviour-learning system utilising Qualitative Spatial Relations (QSRs) to lessen the amount of training data required by the system, and to aid generalisation. In this paper, we provide an analysis of the advantages provided to our system by the use of QSRs. We provide a comparison of a variety of machine learning techniques utilising both quantitative and qualitative representations, and show the effects of varying amounts of training data and temporal representations upon the system. The subject of our work is the game of simulated RoboCup Soccer Keepaway. Our results show that employing QSRs provides clear advantages in scenarios where training data is limited, and provides for better generalisation performance in classifiers. In addition, we show that adopting a qualitative representation of time can provide significant performance gains for QSR systems.

    Bootstrapping Probabilistic Models of Qualitative Spatial Relations for Active Visual Object Search

    In many real-world applications, autonomous mobile robots are required to observe or retrieve objects in their environment, despite not having accurate estimates of the objects’ locations. Finding objects in real-world settings is a non-trivial task, given the complexity and the dynamics of human environments. However, by understanding and exploiting the structure of such environments, e.g. where objects are commonly placed as part of everyday activities, robots can perform search tasks more efficiently and effectively than without such knowledge. In this paper we investigate how probabilistic models of qualitative spatial relations can improve the performance in object search tasks. Specifically, we learn Gaussian Mixture Models of spatial relations between object classes from descriptive statistics of real office environments. Experimental results with a range of sensor models suggest that our model improves overall performance in object search tasks.

    Learning Deep Visual Object Models From Noisy Web Data: How to Make it Work

    Deep networks thrive when trained on large scale data collections. This has given ImageNet a central role in the development of deep architectures for visual object classification. However, ImageNet was created during a specific period in time, and as such it is prone to aging, as well as dataset bias issues. Moving beyond fixed training datasets will lead to more robust visual systems, especially when deployed on robots in new environments which must train on the objects they encounter there. To make this possible, it is important to break free from the need for manual annotators. Recent work has begun to investigate how to use the massive amount of images available on the Web in place of manual image annotations. We contribute to this research thread with two findings: (1) a study correlating a given level of noisy labels to the expected drop in accuracy, for two deep architectures, on two different types of noise, that clearly identifies GoogLeNet as a suitable architecture for learning from Web data; (2) a recipe for the creation of Web datasets with minimal noise and maximum visual variability, based on a visual and natural language processing concept expansion strategy. By combining these two results, we obtain a method for learning powerful deep object models automatically from the Web. We confirm the effectiveness of our approach through object categorization experiments using our Web-derived version of ImageNet on a popular robot vision benchmark database, and on a lifelong object discovery task on a mobile robot. Comment: 8 pages, 7 figures, 3 table

    Home alone: autonomous extension and correction of spatial representations

    In this paper we present an account of the problems faced by a mobile robot given an incomplete tour of an unknown environment, and introduce a collection of techniques which can generate successful behaviour even in the presence of such problems. Underlying our approach is the principle that an autonomous system must be motivated to act to gather new knowledge, and to validate and correct existing knowledge. This principle is embodied in Dora, a mobile robot which features the aforementioned techniques: shared representations, non-monotonic reasoning, and goal generation and management. To demonstrate how well this collection of techniques works in real-world situations, we present a comprehensive analysis of the Dora system’s performance over multiple tours in an indoor environment. In this analysis Dora successfully completed 18 of 21 attempted runs, with all but 3 of these successes requiring one or more of the integrated techniques to recover from problems.

    One Risk to Rule Them All: Addressing Distributional Shift in Offline Reinforcement Learning via Risk-Aversion

    Offline reinforcement learning (RL) is suitable for safety-critical domains where online exploration is not feasible. In such domains, decision-making should take into consideration the risk of catastrophic outcomes. In other words, decision-making should be risk-averse. An additional challenge of offline RL is avoiding distributional shift, i.e. ensuring that state-action pairs visited by the policy remain near those in the dataset. Previous works on risk in offline RL combine offline RL techniques (to avoid distributional shift) with risk-sensitive RL algorithms (to achieve risk-aversion). In this work, we propose risk-aversion as a mechanism to jointly address both of these issues. We propose a model-based approach, and use an ensemble of models to estimate epistemic uncertainty, in addition to aleatoric uncertainty. We train a policy that is risk-averse, and avoids high uncertainty actions. Risk-aversion to epistemic uncertainty prevents distributional shift, as areas not covered by the dataset have high epistemic uncertainty. Risk-aversion to aleatoric uncertainty discourages actions that are inherently risky due to environment stochasticity. Thus, by only introducing risk-aversion, we avoid distributional shift in addition to achieving risk-aversion to aleatoric risk. Our algorithm, 1R2R, achieves strong performance on deterministic benchmarks, and outperforms existing approaches for risk-sensitive objectives in stochastic domains.

    An Integrated Control Framework for Long-Term Autonomy in Mobile Service Robots

    Convex Hull Monte-Carlo Tree Search

    This work investigates Monte-Carlo planning for agents in stochastic environments, with multiple objectives. We propose the Convex Hull Monte-Carlo Tree-Search (CHMCTS) framework, which builds upon Trial Based Heuristic Tree Search and Convex Hull Value Iteration (CHVI), as a solution to multi-objective planning in large environments. Moreover, we consider how to pose the problem of approximating multi-objective planning solutions as a contextual multi-armed bandit problem, giving a principled motivation for how to select actions from the view of contextual regret. This leads us to the use of Contextual Zooming for action selection, yielding Zooming CHMCTS. We evaluate our algorithm using the Generalised Deep Sea Treasure environment, demonstrating that Zooming CHMCTS can achieve a sublinear contextual regret and scales better than CHVI on a given computational budget. Comment: Camera-ready version of paper accepted to ICAPS 2020, along with relevant appendice